The performance of a D-Wave Vesuvius quantum annealing processor was recently compared to a 
suite of classical algorithms on a class of constraint satisfaction instances based on frustrated loops. 

However, the construction of these instances leads the maximum coupling strength to increase with 
problem size. As a result, larger instances are subject to amplified analog control error, and are 
effectively annealed at higher temperatures in both hardware and software. We generate similar 
constraint satisfaction instances with limited range of coupling strength and perform a similar 
comparison to classical algorithms. On these instances the D-Wave Vesuvius processor, run with 
a fixed 20 μs anneal time, shows a scaling advantage in the median case over the software solvers 
for the hardest regime studied. This scaling advantage implies that quantum speedup is not ruled 
out for these problems. Our results support the hypothesis that performance of D-Wave Vesuvius 
processors is strongly influenced by analog control error, which can be reduced and mitigated as the 
technology matures. 


I. INTRODUCTION 

Following the recent introduction of D-Wave quantum 
annealing processors, a wealth of research has aimed to 
characterize the performance of this new platform, in 
particular pitting it against classical competition. 
D-Wave processors take as input spin glass instances in 
the Ising model, and it is straightforward to express a 
variety of NP-hard problems in this format. However, 
the energy landscape of some instances may be more 
amenable to solution by thermal or combinatorial methods 
than quantum methods, and input to current 
D-Wave processors must be reasonably robust to analog 
control error if we are to observe the mechanics of the 
underlying quantum annealing platform rather than 
classical noise. The selection of appropriate testbeds to 
use when probing for quantum speedup has recently been 
the subject of much research. This research has identified 
several desirable properties of input sets, including 
the existence of a nonzero-temperature spin glass phase 
transition, foreknowledge of the ground state energy 
and possibly ground states [1], tunable difficulty, and 
robustness to analog control error and thermal effects. 

Randomly generated instances of constraint satisfaction 
problems (CSPs) or satisfiability problems are an 
attractive target: they are well understood from a 
statistical physics perspective, and their difficulty can be 
tuned by a single parameter: the constraint-to-variable 
ratio α. However, direct solution of these instances 
requires, in general, the ability to couple arbitrary pairs 
of qubits in the processor. While this can be done 
indirectly in a D-Wave processor through creation of logical 
qubits, this may amplify control error and obscure 
the underlying mechanics of the processor [3]. 

Hen managed this issue by constructing constraint 
satisfaction problems that can be directly embedded in 
an arbitrary qubit connectivity graph; each constraint 
is a frustrated loop, i.e. a cycle of couplers of which an 
odd number are antiferromagnetic. These problems have 
two desirable properties: a planted (foreknown) ground 
state and difficulty that can be tuned with the parameter 
α. Hen et al. [1] show that performance scaling for 
the D-Wave processor is superior to the best of a suite 
of classical solvers in one region of α, but is worse in the 
region of α encompassing the hardest instances. However, 
their instances are constructed in such a way that 
for a fixed value of α, thermal effects and analog error are 
increasingly amplified by normalization as the problems 
increase in size. 

Here we present a simple modification of the construction 
of these instances that curtails this effect, putting 
the analog and digital solvers on more level ground and 
reducing unwanted thermal behavior. On these range-limited 
instances, we find that a D-Wave Vesuvius processor 
shows better performance scaling than classical 
competition for all values of α tested. This competition 
consists of the two best-performing classical software 
solvers studied by Hen et al.: the zero-temperature 
Hamze-de Freitas-Selby (HFS) algorithm [24, 25] and a 
solver version of simulated annealing (SAS) [26]. Hen 
et al. also showed very strong correlation between success 
probabilities in the D-Wave processor and a thermal 
Gibbs state approximated using standard simulated 
annealing (SAA) [1]. Moderating the coupling range of the 
input instances, and therefore the temperature relative to 
the final gap of the time-dependent Hamiltonian, reduces 
correlation with the thermal model. 


II. QUANTUM ANNEALING AND THE 
D-WAVE PLATFORM 

Quantum annealing in the Ising model aims to find 
low-energy states in a system of n interacting spins via 
evolution of the time-dependent Hamiltonian 

$$ \mathcal{H}_S(t) = A(t) \sum_i \sigma_i^x + B(t)\, \mathcal{H}_I , \qquad (1) $$

where $0 \le t \le t_f$, $t_f$ is the run time of the QA 
algorithm, $A(0) \gg B(0)$, $A(t_f) \ll B(t_f)$, and 
$\mathcal{H}_I$ is the time-independent Ising problem Hamiltonian: 

$$ \mathcal{H}_I = \sum_{i<j} J_{ij} \sigma_i^z \sigma_j^z + \sum_i h_i \sigma_i^z . \qquad (2) $$

We refer to the Ising Hamiltonian as $\mathcal{H}_I = (h, J)$, where 
the biases h and pairwise couplings J encode the 
optimization problem (i.e. energy function) we wish to solve 
(i.e. minimize). 

In a D-Wave quantum annealing processor, not 
all pairs of qubits are coupled, and therefore the set of 
nonzero entries of J must adhere to the physical constraints 
of the processor. One can view (h, J) as a set of 
vertex and edge weights, respectively, of the qubit 
connectivity graph, whose vertices correspond to qubits and 
whose edges correspond to couplers. 
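
As a minimal illustration of the vertex-weight and edge-weight view of (h, J), the sketch below evaluates the classical Ising energy of Eq. (2) for a spin assignment in {-1, +1}. The function name `ising_energy` and the three-qubit path used as input are hypothetical choices for illustration and are not taken from the experiments.

```python
from typing import Dict, Sequence, Tuple


def ising_energy(h: Sequence[float],
                 J: Dict[Tuple[int, int], float],
                 spins: Sequence[int]) -> float:
    """Classical energy of Eq. (2) for spins in {-1, +1}.

    h holds one bias per vertex; J maps an edge (i, j) of the
    connectivity graph to its coupling. Pairs absent from J are uncoupled.
    """
    field_term = sum(h_i * s_i for h_i, s_i in zip(h, spins))
    coupling_term = sum(J_ij * spins[i] * spins[j] for (i, j), J_ij in J.items())
    return coupling_term + field_term


# Hypothetical 3-qubit path with two ferromagnetic couplers and zero biases.
h = [0.0, 0.0, 0.0]
J = {(0, 1): -1.0, (1, 2): -1.0}
aligned = ising_energy(h, J, [1, 1, 1])    # both couplers satisfied
flipped = ising_energy(h, J, [1, -1, 1])   # both couplers violated
```

With negative (ferromagnetic) couplings, aligned spins lower the energy, which is the sign convention the frustrated loop construction relies on.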

III. FRUSTRATED LOOP PROBLEMS AND 
LIMITED COUPLING RANGE 

For a particular hardware qubit connectivity graph G 
with n vertices (qubits), and a particular constraint-to-qubit 
ratio α, Hen et al. construct a frustrated loop 
instance using k = round(αn) loops like so: 

1. For i = 1, ..., k, loop $\ell_i$ is a cycle in G chosen as the 
first cycle generated by the edges of a random walk 
in G starting at a random vertex. If $\ell_i$ contains 
fewer than 8 vertices, it is discarded and generated 
anew. 

2. The constraint Ising Hamiltonian $J_i$ corresponding 
to $\ell_i$ has value −1 on every edge of $\ell_i$ except for a 
randomly selected edge of $\ell_i$, where it has value 
+1. $J_i$ is zero elsewhere. 

3. The final Ising Hamiltonian is (h, J), where h is the 
zero vector and $J = \sum_{i=1}^{k} J_i$. 
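
The three steps above can be sketched as follows. This is an illustrative reading of the construction, not the authors' code: the interpretation of "first cycle generated by the edges of a random walk" (the walk's edges form a tree until a new edge reaches an already visited vertex), the helper names, and the small grid graph used as a stand-in for the hardware graph are all assumptions.

```python
import random
from typing import Dict, List, Optional, Tuple

Graph = Dict[int, List[int]]   # undirected graph as adjacency lists
Edge = Tuple[int, int]


def tree_cycle(parent: Dict[int, Optional[int]], u: int, v: int) -> List[int]:
    """Vertices of the cycle closed by edge (u, v) plus the tree path between them."""
    up = []
    x: Optional[int] = u
    while x is not None:          # path from u to the root of the walk tree
        up.append(x)
        x = parent[x]
    down = []
    x = v
    while x not in up:            # climb from v until the paths meet
        down.append(x)
        x = parent[x]
    return up[: up.index(x) + 1] + down[::-1]


def first_cycle(graph: Graph, rng: random.Random) -> List[int]:
    """Random walk until a previously unused edge closes a cycle.

    Assumes the start vertex lies in a component that contains a cycle.
    """
    u = rng.choice(sorted(graph))
    parent: Dict[int, Optional[int]] = {u: None}
    used = set()
    while True:
        v = rng.choice(graph[u])
        edge = frozenset((u, v))
        if edge not in used:
            used.add(edge)
            if v in parent:       # new edge to a visited vertex: cycle found
                return tree_cycle(parent, u, v)
            parent[v] = u
        u = v


def frustrated_loop_instance(graph: Graph, alpha: float, rng: random.Random,
                             min_len: int = 8) -> Dict[Edge, int]:
    """Sum of k = round(alpha * n) loop Hamiltonians; h is implicitly zero."""
    J: Dict[Edge, int] = {}
    for _ in range(round(alpha * len(graph))):
        cycle = first_cycle(graph, rng)
        while len(cycle) < min_len:            # step 1: discard short loops
            cycle = first_cycle(graph, rng)
        edges = [tuple(sorted((cycle[t - 1], cycle[t]))) for t in range(len(cycle))]
        frustrated = rng.randrange(len(edges))  # step 2: one +1 edge per loop
        for t, e in enumerate(edges):
            J[e] = J.get(e, 0) + (1 if t == frustrated else -1)  # step 3: sum
    return J


def grid_graph(rows: int, cols: int) -> Graph:
    """Hypothetical stand-in topology (not Chimera): a rows x cols grid."""
    graph: Graph = {r * cols + c: [] for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c
            if c + 1 < cols:
                graph[v].append(v + 1)
                graph[v + 1].append(v)
            if r + 1 < rows:
                graph[v].append(v + cols)
                graph[v + cols].append(v)
    return graph


# One loop on a 4 x 4 grid: alpha * n = 1.
example_graph = grid_graph(4, 4)
example_J = frustrated_loop_instance(example_graph, alpha=1 / 16, rng=random.Random(1))
```

Because nothing in this sketch caps how many loops may share an edge, the magnitude of an entry of J can grow with the number of loops.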

Any instance constructed with this method has integer-valued 
h and J, but the coupling range $R = \max_{i,j} |J_{ij}|$, 
i.e. the maximum magnitude of any entry in J, is not 
necessarily bounded. Moreover, typical instances 
constructed at a fixed ratio α on increasingly large subgrids 
of the D-Wave processor have increasing range limits R. 
Since input (h, J) to the D-Wave processor must 
be normalized to within the range [−1, 1], coupling range 
R necessitates scaling by a factor of 1/R. This scale factor 
creates two complications when studying the efficacy 
of the quantum annealing algorithm on practical hardware. 
First, the operating temperature of the processor 
relative to the magnitude of the input increases with R, 
thus increasing undesirable thermal effects. Second, each 
coupler and local field is subject to analog control error 
on the order of δJ ≈ 0.035 and δh ≈ 0.05 respectively. 
The magnitudes of the errors δh and δJ are relative to the 
normalized full energy scale J = 1. Note that errors 
are present even for h = 0 or J = 0. For scale factors 
of R, analog control error relative to the magnitude of 
the desired input is increased by a factor of R. Thus, 
the deviation of the actual input from the desired input 
increases with increasing R. In the range-unlimited 
instances studied by Hen et al. this amplification factor is 
as high as 17, and can be dictated by a single coupler that 
happens to be in disproportionately many loops. Since 
R grows with instance size n, the D-Wave processor is 
penalized on larger instances. 
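
The amplification argument is simple arithmetic, sketched here under the error magnitudes quoted above (0.035 for couplers and 0.05 for local fields, both relative to the normalized full scale). The function name is a hypothetical choice; the only inputs taken from the text are those two magnitudes and the worst-case range 17.

```python
def relative_control_error(R: float, delta_J: float = 0.035, delta_h: float = 0.05):
    """Control error measured in units of the unnormalized integer input.

    Normalizing by 1/R shrinks the programmed values while the absolute
    error stays fixed, so the error relative to one unit of input grows by R.
    """
    return R * delta_J, R * delta_h


err_J_17, err_h_17 = relative_control_error(17)   # worst range reported for unlimited instances
err_J_2, err_h_2 = relative_control_error(2)      # a small fixed range for comparison
```

At R = 17 the coupler error is roughly 0.6 of one unit of integer coupling, whereas at R = 2 it is 0.07.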

In order to address this issue, we construct each instance 
with respect to an integer coupling range R ≥ 2, 
so that in our instances each entry of h and J is an integer 
between −R and R. To do this, when selecting 
a candidate for $\ell_i$ via random walk we ignore edges of 
G on which the accumulated coupling $|\sum_{j<i} J_j|$ is 
already R. This ensures that 
the final Hamiltonian (h, J) has all entries in the range 
[−R, R], so when the instance is necessarily normalized 
to the range [−1, 1] as input to the D-Wave processor, it 
is scaled down by no more than a factor of 1/R. Consequently, 
analog control errors and thermal effects in 
our instances are relatively amplified by no more than a 
factor of R, where R is independent of instance size n. 
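
A sketch of the range-limited variant follows. It is an assumption-laden illustration rather than the authors' generator: the random walk simply skips edges whose accumulated coupling already has magnitude R, a minimum loop length stands in for the unit-cell rejection rule, and the grid graph, step limits, and function names are hypothetical.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Graph = Dict[int, List[int]]
Edge = Tuple[int, int]


def walk_cycle(graph: Graph, allowed: Callable[[int, int], bool],
               rng: random.Random, max_steps: int = 1_000) -> Optional[List[int]]:
    """First cycle closed by a random walk restricted to `allowed` edges."""
    u = rng.choice(sorted(graph))
    parent: Dict[int, Optional[int]] = {u: None}
    used = set()
    for _ in range(max_steps):
        options = [v for v in graph[u] if allowed(u, v)]
        if not options:
            return None                      # walk is stuck; caller retries
        v = rng.choice(options)
        edge = frozenset((u, v))
        if edge not in used:
            used.add(edge)
            if v in parent:                  # new edge to a visited vertex
                up, x = [], u
                while x is not None:
                    up.append(x)
                    x = parent[x]
                down, x = [], v
                while x not in up:
                    down.append(x)
                    x = parent[x]
                return up[: up.index(x) + 1] + down[::-1]
            parent[v] = u
        u = v
    return None


def range_limited_instance(graph: Graph, alpha: float, R: int, rng: random.Random,
                           min_len: int = 6, max_tries: int = 10_000) -> Dict[Edge, int]:
    """Frustrated loop instance whose couplings stay within [-R, R]."""
    J: Dict[Edge, int] = {}

    def key(u: int, v: int) -> Edge:
        return (min(u, v), max(u, v))

    def allowed(u: int, v: int) -> bool:
        return abs(J.get(key(u, v), 0)) < R   # ignore saturated edges

    target, loops = round(alpha * len(graph)), 0
    for _ in range(max_tries):
        if loops == target:
            break
        cycle = walk_cycle(graph, allowed, rng)
        if cycle is None or len(cycle) < min_len:
            continue
        edges = [key(cycle[t - 1], cycle[t]) for t in range(len(cycle))]
        frustrated = rng.randrange(len(edges))
        for t, e in enumerate(edges):
            J[e] = J.get(e, 0) + (1 if t == frustrated else -1)
        loops += 1
    if loops < target:
        raise RuntimeError("could not place all loops within the range limit")
    return J


def grid_graph(rows: int, cols: int) -> Graph:
    """Hypothetical stand-in topology (not Chimera)."""
    graph: Graph = {r * cols + c: [] for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c
            if c + 1 < cols:
                graph[v].append(v + 1)
                graph[v + 1].append(v)
            if r + 1 < rows:
                graph[v].append(v + cols)
                graph[v + cols].append(v)
    return graph


limited_J = range_limited_instance(grid_graph(5, 5), alpha=0.2, R=2,
                                   rng=random.Random(7), min_len=4)
```

Each loop changes an edge by exactly one unit, and only edges with magnitude below R are eligible, so no entry can leave [-R, R].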

There is another, less crucial modification of the 
construction: While Hen et al. reject and resample a choice 
of $\ell_i$ if it is too short, we reject the choice if it is contained 
in a single eight-qubit unit cell [28]. Thus we sometimes 
allow loops of length 6, and sometimes forbid loops of 
length 8. This modification should in principle allow for 
greater frustration and less domain clustering within unit 
cells. 

In any meaningful study of analog quantum annealing 
processors it is desirable to limit relative amplification 
of analog control error and unwanted thermal effects if it 
can be done without otherwise materially detracting from 
the experiment. In this work we consider R ∈ {2, 3, ∞} 
and α ∈ [0.1, 0.5]; these values of α include the hardest 
regime. Implications of our choice of coupling range and 
loop selection criteria are considered in greater detail in 
the Supplemental Material [29]. 

All of these frustrated loop instances will, by construction, 
have ↑↑⋯↑ and ↓↓⋯↓ as planted ground states. 
Hen et al. describe these instances as being constructed 
with respect to an arbitrary antipodal pair of planted 
solutions, but our construction is equivalent under change 
of variables (Ising spin reversal) both in theory and, due 
to the application of random spin reversals in hardware, 
in practice. 
IV. EXPERIMENTAL RESULTS 


We compare performance of a D-Wave quantum annealing 
processor to HFS and SAS, both described by 
Hen et al. [1]. Our instances are constructed on subgraphs 
of a Chimera graph $C_L$ [28], in which n of $N = 8L^2$ 
qubits are functional, for 4 ≤ L ≤ 8 (see Supplemental 
Material [29]). Following their methodology, we run SAS 
on a linear schedule of optimal length in inverse temperature 
spanning the range β ∈ [0.01, 5] after scaling the 
input to the range [−1, 1]. Scaling of the Hamiltonian has 
no bearing on HFS, which is a large-neighborhood 
zero-temperature search that exploits low-treewidth induced 
subgraphs in the Chimera architecture. 

In order to remain consistent with previous probes for 
quantum speedup, we assume that classical algorithms 
are run by a perfectly parallel oracle that allows 
all n sites to be updated simultaneously in simulated 
annealing, and allows all possible cell updates in HFS to 
be performed in parallel. Going even further, we simply 
divide SAS running time by n and divide HFS running 
time by $L = \sqrt{N/8}$, the maximum possible number of 
parallel cell updates at any point in the algorithm. Further 
detail on experimental methods, benchmarking and 
data analysis is given in the Supplemental Material [29]. 
To account for differences between implementations and 
hardware, we use the assumption of Hen et al. that each 
Monte Carlo sweep takes time $t_{sa}$ = 3.54 μs. We assume 
that each HFS unit cell update takes 1 μs. 

The D-Wave processor used was a D-Wave Two V6 
processor of the same architecture and fabrication lot as 
the processor used by Hen et al. [1]. 


A. Performance scaling results 

In Fig. 1 we show the scaling of the median time to 
solution for the three solvers as the problem size L increases. 
As in previous work, we are particularly interested 
in how the ratio between two solvers' time to solution 
scales with respect to problem size. This is shown 
in Fig. 2. A positive slope in Fig. 2 indicates a performance 
scaling advantage for the D-Wave processor, and 
the possibility of limited quantum speedup as defined by 
Rønnow et al. [4] in the case of SAS, and the possibility 
of potential quantum speedup in the case of HFS (see 
footnote 1). In the Supplemental Material [29] we arrange 
the data for range-2, range-3, and range-unlimited 
instances by α. 

D-Wave Vesuvius processors allow a minimum anneal 
time of 20 μs; previous work has shown that $C_8$-scale 
problems with optimal anneal time greater than this are 
elusive. Proving limited quantum speedup in this 


1 The distinction arises because HFS is a combinatorial algorithm 
rather than one based on a physical model. 


[Figure 1 panels: D-Wave, HFS, and SAS, each shown for range 2 and range 3.] 

FIG. 1. Median time to solution per size. Shown is the 
median time to solution for the D-Wave processor (left), HFS 
(middle), and SAS (right). The top and bottom rows show 
data for range-2 and range-3 instances respectively. Following 
Hen et al. [1], we divide HFS times by $L = \sqrt{N/8}$ to simulate 
hypothetical parallelization. SAS data incorporates full n-core 
hypothetical parallelization. Error bars represent one 
standard deviation from bootstrap samples; most are smaller 
than the data markers. 


framework would require data from the D-Wave processor 
using shorter anneals to certify that we are not 
artificially slowing the processor on easier instances. In 
particular for the smaller and easier problems, the minimum 
anneal time may mask the true performance scaling 
of the quantum annealing platform. This may explain, 
to some extent, the outstanding performance of 
the D-Wave processor on high-α instances. 


In the Supplemental Material [29] we analyze the effect 
that range limitation has on the difficulty of the problems. 
Here we simply note that for the hardest range of 
α, limiting the range of instances to 3 does not seem to 
make the problems significantly easier. This can be seen 
where range-2, range-3, and range-unlimited instances 
are compared for the available solvers. 
FIG. 2. Median ratio of running time by size. Shown is 
the median ratio of time to solution for the D-Wave processor 
compared with each classical solver. Positive slope represents 
an increasing advantage for the D-Wave processor as problems 
get larger. The hardest overall regime roughly corresponds to 
α ≈ 0.25. Error bars represent one standard deviation from 
bootstrap samples. 


B. Comparison with a nearly equilibrated thermal 
annealer 

Hen et al. found strong correlation between the success 
probabilities of a D-Wave processor and a nearly equilibrated 
thermal annealer with a final inverse temperature 
of $\beta_f = 5$. Our results (see Fig. 3 and Supplemental 
Material [29]) show poorer correlation and an inability to fit 
D-Wave scaling data to SAA at a single inverse temperature. 
Unlike the range-unlimited instances studied in 
[1], the hardness peak for SAA does not remain constant 
with varying $\beta_f$. In the Supplemental Material we show 
instancewise scatter plots of D-Wave and SAA success 
probabilities. The predictions of the thermal model do 
not correlate well with the hardware on the range-2 and 
range-3 instances. 

V. DISCUSSION 

Far from being artificial, fixed coupling range in the 
large system limit appears at the phase transition in the 
Ising formulation of Not-All-Equal 3-SAT [31] (see 
footnote 2), which is NP-hard and can be formulated as a 
frustrated loop problem on the complete graph. Furthermore, 
limiting coupling range in hard frustrated loop instances 
affects a vanishingly small portion of each instance (see 
Supplemental Material [29]). 

The study of instances with fixed coupling range allows 
for the control of two factors: amplification of analog 
control error (for D-Wave) and effective operating temperature 
(for D-Wave, SAS, SAA, and any other simulated 
physical model with a thermal component [1]). Our 
results, taken in conjunction with those from Ref. [1], 
indicate a decreasing advantage for the D-Wave hardware 
relative to SAS with increasing range. This observation 
is consistent with the hypothesis that increasing range 
penalizes the hardware by augmenting both the relative 
magnitude of control errors and the importance of 
thermalization. 

It is straightforward to construct input classes for 
which analog control error will dominate the performance 
scaling of an analog processor. When probing the potential 
for quantum speedup in a quantum annealing platform, 
it is important to do the opposite: construct an 
input class for which the impact of analog control error 
is minimal. In doing so we might better observe properties 
of the annealer's mechanics rather than observing the 
effect of precision limitations, which by now are reasonably 
well understood and expected to improve 
with the maturation of the technology and the possible 
implementation of error correction strategies [20], [22]-[25]. 


ACKNOWLEDGMENTS 


The authors thank Tameem Albash, Itay Hen, Joshua 
Job, and Daniel Lidar for a detailed and informative 
exchange on this work and theirs, and for generous 
provision of data. They thank Evgeny Andriyash and 
Jack Raymond for fruitful discussions about frustrated 
loops, and Emile Hoskinson for providing specifications 
of the D-Wave processor used. They thank Mohammad 
Amin and Miles Steininger for valuable comments on the 
manuscript. 


2 The expected number of constraints containing a given pair of 
variables is approximately 12.6/n at the phase transition for 
[Figure 3 panels: scaling coefficients on range-2 instances and on range-3 instances.] 

FIG. 3. Scaling coefficients for the D-Wave processor and SAA. Shown are scaling coefficients (exponential slope fits, 
i.e. time to solution ∝ exp(b(α)L)) for performance of the D-Wave processor and SAA on range-2 and range-3 instances. Hen 
et al. find agreement between the D-Wave processor and SAA at $\beta_f = 5$. Here each final inverse temperature $\beta_f \in \{3, 4, 5\}$ 
appears deficient in some regime as a model for performance of the D-Wave processor. Error bars represent two standard 
deviations from the bootstrap set. 
I. METHODS 

A. Quantum annealing platform 

In this work we used a Vesuvius quantum annealing 
processor manufactured by D-Wave Systems Inc. The 
processor is of identical design to the D-Wave Two V6 
processor installed at ISI [1], and is from the same 
fabrication lot. We call these two processors "SR10-V6" 
and "ISI-V6" respectively. The problems we consider are 
generated on subgraphs of the processor's Chimera qubit 
connectivity graph [28], using up to 467 qubits (see Fig. 
5). The data were gathered in June 2014. SR10-V6 had 
an operating temperature of approximately 15 mK, and 
used maximum J inductance (coupling strength) of 1.25 
pH, compared with temperature and inductance of 17 mK 
and 1.33 pH for ISI-V6 [1]. All experiments were run using 
an anneal length of $t_a$ = 20 μs, equal to the runs used 
for most of the key analysis in the work of Hen et al. [1]. 
Although the processors have the same architecture and 
similar ratios of temperature to inductance, they used 
different annealing schedules (see Fig. 6 and Ref. [1]). 

FIG. 5. The largest subprocessor used, a partial $C_8$ graph 
(an 8 × 8 grid of unit cells) with 467 qubits. Smaller instances 
use the square subgrid containing the top-left corner. 

[Figure 6 plot; horizontal axis: $t/t_f$.] 

FIG. 6. Annealing schedule of the D-Wave processor. 
A(t) and B(t) represent the transverse field and longitudinal 
couplings, respectively. Operating temperature of 15 mK is 
shown. Scale is normalized to h = 1. 


B. Experimental details 

The primary testbed of problems studied consists of 
200 instances of each size L ∈ {4, 5, 6, 7, 8}, for each value 
of α ∈ {0.10, 0.15, ..., 0.50}, for each range limit R ∈ 
{2, 3, ∞}. Data were not collected using the hardware for 
R = ∞, as the processor was taken offline in 2014 before 
the need for such results was apparent. In Section II B we 
provide evidence that such data would likely be similar to 
the results in Ref. [1], where quantum annealing success 
probabilities are highly correlated with themselves when 
run on a different annealing schedule. 


1. D-Wave processor 

Each instance was annealed 10240 times by the D-Wave 
processor with an anneal length of 20 μs, the minimum 
allowed by the system. These experiments were 
performed in batches of 1024 anneals, each batch with a 
random Ising spin reversal, or gauge transformation, 
applied, as in previous work [4, 32]. 
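
A gauge transformation of this kind can be sketched as below. The sketch assumes the usual definition, in which a random sign vector g maps $h_i$ to $g_i h_i$ and $J_{ij}$ to $g_i g_j J_{ij}$ and returned spins are multiplied by g to recover the original variables; the function names and the toy instance are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def gauge_transform(h: Sequence[int], J: Dict[Edge, int], rng: random.Random):
    """Apply a random spin reversal to (h, J); returns the signs and the new instance."""
    g = [rng.choice((-1, 1)) for _ in h]
    h_g = [g[i] * h[i] for i in range(len(h))]
    J_g = {(i, j): g[i] * g[j] * J_ij for (i, j), J_ij in J.items()}
    return g, h_g, J_g


def undo_gauge(g: Sequence[int], spins: Sequence[int]) -> List[int]:
    """Map a sample of the transformed instance back to the original variables."""
    return [g_i * s_i for g_i, s_i in zip(g, spins)]


def energy(h: Sequence[int], J: Dict[Edge, int], spins: Sequence[int]) -> int:
    """Classical Ising energy of a spin assignment."""
    return (sum(J_ij * spins[i] * spins[j] for (i, j), J_ij in J.items())
            + sum(h_i * s_i for h_i, s_i in zip(h, spins)))


# Toy 4-spin frustrated loop: three ferromagnetic couplers and one antiferromagnetic.
rng = random.Random(3)
h0 = [0, 0, 0, 0]
J0 = {(0, 1): -1, (1, 2): -1, (2, 3): -1, (0, 3): 1}
g, h1, J1 = gauge_transform(h0, J0, rng)
sample = [rng.choice((-1, 1)) for _ in h0]          # a sample of the transformed instance
same = energy(h1, J1, sample) == energy(h0, J0, undo_gauge(g, sample))
```

The transformation leaves the energy spectrum unchanged, so averaging over several random choices of g can help separate problem structure from qubit-specific biases.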
2. HFS 

Selby's implementation of the HFS algorithm [1, 24, 25] 
is a heuristic approach whose main effort consists of tree 
updates, and which typically restarts to a random state 
when it performs two tree updates without improvement 
in energy. In order to treat HFS more like an annealer, 
we modified the code so that a solution is recorded before 
each reset. As in SAS, the original algorithm is modified 
to return possibly optimal intermediate states. It would 
be algorithmically equivalent to force the HFS algorithm 
to terminate instead of resetting. 

In our experiments, each instance was solved 10240 
times by the HFS algorithm. The runs were performed 
single-threaded in parallel using an HPC cluster of 8-core 
Intel Xeon E5-2670 processors. 


3. SAS 

SAS was run on each instance 1024 times at anneal 
length round($2^{a/2}$) for each integer a ∈ {2, ..., 22}. 
Thus the longest anneal for each instance is 2048 Monte 
Carlo sweeps, enough so that the optimal anneal length 
for median time to solution is exceeded for all problem 
sets studied (see Fig. 7). 
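
The resulting grid of anneal lengths can be enumerated directly. The sketch assumes that the rounding in the text means rounding to the nearest integer; the variable name is illustrative.

```python
# Anneal lengths in Monte Carlo sweeps: round(2 ** (a / 2)) for a = 2, ..., 22.
anneal_lengths = [round(2 ** (a / 2)) for a in range(2, 23)]

shortest = anneal_lengths[0]    # a = 2
longest = anneal_lengths[-1]    # a = 22
```

Successive lengths grow by a factor of about the square root of 2, giving 21 geometrically spaced lengths that end at the 2048 sweeps quoted above.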


4. SAA 

SAA was run on each instance 1024 times for 20,000 
Monte Carlo sweeps. For similar instances, Hen et al. 
support the claim that 20,000 gets us reasonably close 
to equilibration (of performance as a constant-time 
optimizer) for frustrated loop problems. We did not 
investigate longer SAA anneals due to limited availability of 
computing resources. Like SAS, SAA was run on a linear 
schedule in inverse temperature from $\beta_0 = 0.01$ to 
$\beta_f \in \{3, 4, 5\}$. 


C. Benchmarking methodology 

Following previous work [1, 4, 32], we treat each solver 
(now including HFS) as a stochastic sampler, and measure 
the time to reach 99% confidence of having found 
the ground state energy of an instance. We call this time 
to solution (TTS). Given a solver achieving success 
probability p over a set of trials (i.e. anneals) and taking mean 
time τ to complete a single trial, we compute the number 
of samples required for this solver and instance 
as 

$$ r(p) = \frac{\log(0.01)}{\log(1 - p)} , \qquad (1) $$

and compute TTS for this solver and instance as rτ. For 
SAS we assume τ to be 3.54 μs multiplied by the number 
of update sweeps in the anneal in order to remain consistent 
with Hen et al. [1]. We determine the optimal sweep 
length for each set of 200 problems as the length giving 
the minimum bootstrapped median time to solution. We 
disregard the issue of conditioning our SAS results on 
minimizing TTS; due to the smoothness of the curves in 
Fig. 7 near the optimal anneal lengths, we do not expect 
this to have a significant effect on the results. 
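
The TTS definition above translates into a few lines of code. In this sketch the function names are hypothetical and the handling of the boundary cases p = 0 and p = 1 is an assumption (the limiting values of the formula); the 99% target and the 20 μs anneal time are taken from the text.

```python
import math


def samples_required(p: float, confidence: float = 0.99) -> float:
    """Number of independent trials needed to see the ground state at least
    once with the given confidence, for per-trial success probability p."""
    if p <= 0.0:
        return math.inf          # limit of the formula as p -> 0
    if p >= 1.0:
        return 0.0               # limit of the formula as p -> 1
    return math.log(1.0 - confidence) / math.log(1.0 - p)


def time_to_solution(p: float, tau: float) -> float:
    """TTS = r(p) * tau, with tau the mean time of a single trial."""
    return samples_required(p) * tau


# Example: a solver that succeeds on half of its 20-microsecond anneals.
r_half = samples_required(0.5)
tts_half_us = time_to_solution(0.5, tau=20.0)
```

For p = 0.5 the formula gives $\log_2 100 \approx 6.64$ samples, i.e. roughly 133 μs at 20 μs per anneal.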

For HFS our methodology differs from that of Hen et 
al.: We use an enumerative effort computation, as with 
the D-Wave processor and SAS, rather than timing the 
process. This allows a bare look at the dominant 
operations. Following Hen et al., we assume hypothetical 
parallelization of L cores on a $C_L$ instance, and accordingly 
assume that a tree update on a $C_L$ instance takes 
O(L) parallel steps (actually L steps) of "leaf updates". 
Using these assumptions we compute the total number 
of leaf update steps required for a given "anneal" (i.e. 
sample draw) and for convenience we assume a constant 
of L · 1 μs per tree update, noting that this gives reasonably 
comparable performance at $C_4$ scale to the results 
of Hen et al. [1]. 


1. Statistical methods 

To generate the data points and error bars in the 
performance and speedup figures, we used the same Bayesian 
bootstrapping approach used by Hen et al. [1]. First we 
describe the approach for performance data, which differs 
slightly for HFS. 

For a given solver, SR10-V6 or SAS or SAA, and each 
set S of 200 instances at a given value of (L, α), we have, 
for each instance $s_i$, an empirical success probability $p_i$ 
representing $x_i$ successes out of y trials. We consider 
the probability distribution of success probability to be 
$\beta_i = \mathrm{Beta}(x_i + 1/2,\; y - x_i + 1/2)$. We then construct 1000 
bootstrap sets $S_j$ of size 100 by drawing 200 members 
from S with repetition, resulting in multisets $S_j = \{s_{ij} \mid 
1 \le i \le 100\}$. Now for each set we sample a probability 
$p_{ij}$ from distribution $\beta_{ij}$. 

At this point we can apply the desired function to 
the set of 100 probabilities, which is typically the median 
of $\{r(p_{ij})\}$ for a fixed j, giving a value $f_j$. We then take 
the data point to be the mean of the $f_j$, and we take the 
error in the statistic to be the standard deviation of the $f_j$. 
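
The bootstrap can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the analysis code: each bootstrap set here has as many members as the input set (the text quotes both 100 and 200), the statistic is the median TTS, sampled probabilities are clamped away from 0 and 1 to keep the logarithm finite, and all names are hypothetical.

```python
import math
import random
import statistics
from typing import Optional, Sequence, Tuple


def bootstrap_median_tts(successes: Sequence[int], trials: int, tau: float,
                         n_boot: int = 1000,
                         rng: Optional[random.Random] = None) -> Tuple[float, float]:
    """Mean and standard deviation of the bootstrapped median time to solution.

    successes[i] is the number of successful trials for instance i out of
    `trials`; tau is the time of one trial.
    """
    rng = rng or random.Random(0)
    m = len(successes)
    medians = []
    for _ in range(n_boot):
        # Resample instances with repetition.
        resampled = [successes[rng.randrange(m)] for _ in range(m)]
        tts = []
        for x in resampled:
            # Draw a success probability from Beta(x + 1/2, trials - x + 1/2).
            p = rng.betavariate(x + 0.5, trials - x + 0.5)
            p = min(max(p, 1e-12), 1.0 - 1e-12)    # guard the logarithm
            tts.append(tau * math.log(0.01) / math.log(1.0 - p))
        medians.append(statistics.median(tts))
    return statistics.mean(medians), statistics.stdev(medians)


# Hypothetical data: 20 instances, each solved in 512 of 1024 anneals of 20 microseconds.
mean_tts, sd_tts = bootstrap_median_tts([512] * 20, trials=1024, tau=20.0,
                                        n_boot=200, rng=random.Random(1))
```

Drawing the probability from a Beta distribution rather than using the empirical frequency keeps the estimate finite for instances with zero observed successes.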

For HFS we use a similar approach suggested by 
Joshua Job (personal communication, January 2015). 
After we sample 1000 values indicating the number $r_i$ of 
samples needed for 99% assuredness of success, we take 
$\lceil r_i \rceil$ random samples (with repetition) from our set of 
10240 samples, and take the sum of the numbers of tree 
updates in those $\lceil r_i \rceil$ samples to give a sample of time to 
solution. 

FIG. 7. SAS time to solution vs. anneal length for $C_8$ instances. Shown here are SAS results for R ∈ {2, 3, ∞} on 
$C_8$ problems. Note that all panels are qualitatively similar. Behavior at the left of each panel is a statistical artifact arising 
from our Bayesian model and limited number of experiments, which effectively places a lower bound on the success probability 
(cf. the data of Hen et al. on suboptimal annealing time [1]). The slight increase in difficulty for the hardest problems as R 
increases can be explained by higher temperature relative to the gap of the final Hamiltonian. The increase in difficulty for 
long anneals in high-α problems as R increases can be explained by the suppression of randomness (and accelerated evolution 
of ferromagnetic structure) for low-R, high-α combinations (see Appendix II B). 


Speedup. To compute data on speedup between two 
solvers (or equally, the same solver on two problem sets), 
we assume that time to solution is normally distributed 
with mean and standard deviation as estimated above. 
We then sample 1000 points from each normal distribution 
and compute 1000 speedup ratios. We use the mean 
and standard deviation of this set of ratios for our data 
point and error bar. 
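
The ratio sampling just described can be sketched as follows. The function name and the example numbers are hypothetical, and the sketch assumes that each standard deviation is small compared with its mean, so that the sampled denominator stays well away from zero.

```python
import random
import statistics
from typing import Optional, Tuple


def speedup_ratio(mean_num: float, sd_num: float,
                  mean_den: float, sd_den: float,
                  n: int = 1000,
                  rng: Optional[random.Random] = None) -> Tuple[float, float]:
    """Mean and standard deviation of the ratio of two normally distributed
    time-to-solution estimates, obtained by sampling n pairs."""
    rng = rng or random.Random(0)
    ratios = [rng.gauss(mean_num, sd_num) / rng.gauss(mean_den, sd_den)
              for _ in range(n)]
    return statistics.mean(ratios), statistics.stdev(ratios)


# Hypothetical estimates: 200 +/- 2 microseconds versus 100 +/- 1 microseconds.
ratio_mean, ratio_sd = speedup_ratio(200.0, 2.0, 100.0, 1.0, rng=random.Random(5))
```

Sampling the ratio avoids a first-order error propagation formula while still reflecting the uncertainty of both estimates.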

Scaling coefficients. When computing scaling coefficients 
(see Appendix II C), we assume normal distributions 
on TTS for a given solver and set of 1000 instances 
(200 of each size). We then draw 1000 samples 
from each of the five distributions, and for each set of 
five samples we compute the slope of the best fit line 
ln(TTS) ≈ a(α) + b(α)L. From these 1000 slopes we take 
the mean and standard deviation. Error bars in Fig. 3 
represent two standard deviations, in keeping close to the 
methods of Hen et al., who use 95% confidence intervals 
[1]. 
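
A sketch of the slope estimate, assuming an ordinary least-squares fit of ln(TTS) against L for each set of five sampled values. The helper names are hypothetical, draws that happen to be non-positive are skipped because their logarithm is undefined (an assumption not discussed in the text), and the synthetic input below is constructed with a known slope of 0.5.

```python
import math
import random
import statistics
from typing import Optional, Sequence, Tuple


def fit_slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Least-squares slope of ys against xs."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


def scaling_coefficient(sizes: Sequence[float], tts_mean: Sequence[float],
                        tts_sd: Sequence[float], n: int = 1000,
                        rng: Optional[random.Random] = None) -> Tuple[float, float]:
    """Mean and standard deviation of the slope b in ln(TTS) = a + b * L."""
    rng = rng or random.Random(0)
    slopes = []
    for _ in range(n):
        draws = [rng.gauss(m, s) for m, s in zip(tts_mean, tts_sd)]
        if min(draws) <= 0.0:
            continue                      # logarithm undefined for this draw
        slopes.append(fit_slope(sizes, [math.log(d) for d in draws]))
    return statistics.mean(slopes), statistics.stdev(slopes)


# Synthetic check: TTS = exp(0.5 * L) with 1% uncertainty at each size.
sizes = [4, 5, 6, 7, 8]
means = [math.exp(0.5 * L) for L in sizes]
sds = [0.01 * m for m in means]
b_mean, b_sd = scaling_coefficient(sizes, means, sds, rng=random.Random(2))
```

On the synthetic data the recovered slope is close to 0.5, which is a quick sanity check before applying the same procedure to measured TTS values.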


II. FURTHER DATA 

A. Problem hardness as a function of 
constraint-to-variable ratio 

In Section III we described Itay Hen's original 
construction of frustrated loop instances, and offered a 
modification. For convenience, we call the former HenFL 
instances, and the latter KingFL instances. As explored 
by Hen et al. [1], there is a clear easy-hard-easy pattern 
of hardness for all solvers as α increases from 0 to 
1 and beyond, with the hardest regime generally falling 
near the point α = 0.25. This is shown to correlate with 
frustration in the problem, measured by Hen et al. as 
the proportion of extant couplers that are frustrated in 
the planted ground state. For large α, systems tend 
towards ferromagnetism. For sufficiently small α, a typical 
system is simply a collection of disjoint frustrated loops, 
and therefore both highly degenerate and combinatorially 
trivial. In Fig. 8 we see the same qualitative dependence 
on α. 

In HenFL problems, the mean loop length is approximately 
11. In KingFL problems it is approximately 9 
(see Fig. 9 for the instances studied). It is therefore not 
surprising that the hardest problems appear for slightly 
larger α in our results compared with the results of Hen 
et al. We remark that a well-yielded Chimera graph will 
have between 5n/2 and 3n edges, so in our data the hardest 
problems arise when the expected number of loops 
containing a given coupler is near 1. 
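
The edge-count remark can be checked with a short calculation. The sketch assumes a fully yielded Chimera graph $C_L$ (every qubit and coupler working), in which each unit cell is a complete bipartite graph on 4 + 4 qubits and adjacent cells are joined by 4 couplers; the hardware graph used here has inoperable qubits, so its counts are somewhat lower. Function names are hypothetical.

```python
from typing import Tuple


def chimera_counts(L: int) -> Tuple[int, int]:
    """Qubits and couplers of a fully yielded L x L Chimera graph."""
    qubits = 8 * L * L
    intra_cell = 16 * L * L          # 4 x 4 complete bipartite couplers per cell
    inter_cell = 8 * L * (L - 1)     # 4 couplers per adjacent pair of cells
    return qubits, intra_cell + inter_cell


def loops_per_coupler(alpha: float, n: int, edges: int, mean_loop_length: float) -> float:
    """Expected number of loops through a uniformly chosen coupler."""
    return round(alpha * n) * mean_loop_length / edges


n8, edges8 = chimera_counts(8)
edge_ratio = edges8 / n8                                   # couplers per qubit
density_025 = loops_per_coupler(0.25, n8, edges8, 9.0)     # mean loop length about 9
density_030 = loops_per_coupler(0.30, n8, edges8, 9.0)
```

With a mean loop length of about 9 and just under three couplers per qubit, the loop density per coupler approaches 1 for α between roughly 0.25 and 0.35.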


B. Effect of range limitation and loop distribution 

Of fundamental importance to this work is the question 
of whether or not limiting the coupling range of 
frustrated loop instances makes them intrinsically easier. 
Here we give evidence that the difference in performance 
of the D-Wave processors here and in the paper of Hen 
et al. [1] cannot be explained by our testbed being easier. 

Clearly the range-limited instances studied here are 
easier for the SR10-V6 D-Wave processor used in this 
work than the range-unlimited HenFL instances are for 
the ISI-V6 processor, and clearly the range-2 instances 
are easier than the range-3 instances for SR10-V6. For an 
idea of whether or not range-limited instances are 
combinatorially easier, we appeal to HFS performance, since 
HFS is unaffected by coupling range and has no thermal 
component. Perhaps the most pertinent answer to 
this question is in the bottom-middle panel of Fig. 11. 
There we see that for the highest α, HFS finds range-2 
instances consistently easier than range-∞ instances, 
but this phenomenon weakens as α decreases. There is 
a simple intuitive reason for this: for any given coupling 
range limit R, if α is sufficiently large (but not so large 
that a random KingFL instance cannot be consistently 
constructed), the number of loops containing an edge is 
forced to be relatively consistent across the edges of the 
graph. Consequently, the system is more orderly and 
therefore more ferromagnetic; recall that at least 5/6 
of nonzero couplers in each $J_i$ are ferromagnetic. This 
intuition is corroborated by our HFS results. 

[Figure 8 panels: D-Wave, HFS, and SAS, each shown for range 2 and range 3.] 

FIG. 8. Time to solution vs. constraint-to-qubit ratio 
α. Shown is the median time to solution for the D-Wave 
processor (left), HFS (middle), and SAS (right) plotted against 
α for each problem size $C_L$ for L ∈ {4, 5, 6, 7, 8}. 

[Figure 9 panels: range-2 instances and range-3 instances; horizontal axis: problem size $C_4$ to $C_8$; one curve for each α ∈ {0.10, 0.15, ..., 0.50}.] 

FIG. 9. Loop length in experimental testbed. Shown 
here are the mean and standard deviation of loop length for 
range-2 and range-3 instances. The distribution appears to 
converge to approximately length 9. This may be slightly 
different for more fully-yielded Chimera graphs. 

[Figure 10 panels: HFS, range ∞ and SAS, range ∞; horizontal axis: problem size; one curve for each α.] 

FIG. 10. Median time to solution per size for range-unlimited 
instances. Shown here are data for HFS and 
SAS on range-unlimited instances; these plots are analogous 
to Fig. 1. 


In Figures 12 and 13 we give HFS data on a more
extensive set of inputs, arranged according to α. For each
choice of L and α, the set contains 200 instances. Fig. 12
suggests that we should not expect the range-unlimited
KingFL instances on the SR10-V6 hardware graph to be
any easier than the range-unlimited HenFL instances on
the ISI-V6 hardware graph, if for each set we choose α
to maximize median HFS time to solution. This takes
into consideration the fact that ISI-V6 uses 503 qubits
on C_8 problems, while SR10-V6 uses only 467. Fig. 13
suggests that for L < 8 and α < 0.3, we should not
expect range-3 KingFL instances to be consistently easier
than range-unlimited KingFL instances. For L < 8 and
α < 0.2, we should not expect range-2 KingFL instances
to be consistently easier than range-unlimited KingFL
instances. In short, it appears that the difference between
our results and the results of Hen et al. [1], especially the
scaling advantage we see for the D-Wave processor on
[Figure residue. Panel titles: D-Wave, HFS, and SAS for range 2 vs. 3 and for range 3 vs. ∞. x-axis: problem size.]


FIG. 11. Benefit of reduced coupling range. Here we
show the median speedup arising from changing the coupling
range for each solver. No range-∞ data exists for the D-Wave
processor. Positive slopes indicate increasing benefit from
reduction of precision range as problems get larger. Error bars
represent one standard deviation from bootstrap samples.


range-3 instances, cannot be attributed to the problems 
being fundamentally easier. 

For SAS, the issue of computational difficulty is
convolved with that of effective temperature. This is most
troublesome for range-∞ instances, where the energy
scale (and relative temperature) varies significantly
between instances (see Fig. 14). Speedup for range-3 over
range-2 problems for low α may indicate that the final
inverse temperature β_f is too low relative to the energy
scale of the normalized Hamiltonian, whereas speedup
for range-3 over range-2 problems for high α may
indicate, in agreement with HFS data, that these problems
are easier in range 2. Fig. 7 provides further insight into
the question of easiness and temperature.

Going back to Fig. 11, we emphasize that particularly
for α < 0.3, which includes the hardest instances, the
HFS speedup as coupling range decreases is small
compared with both the speedup of the D-Wave processor
over HFS and the speedup of the D-Wave processor
between range-2 and range-3 instances. For α < 0.25,
range-3 instances appear to have very little structural
“easiness” compared with range-∞ instances.
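
As a rough illustration of how a median speedup with bootstrap error bars, as reported in Fig. 11, could be computed, the following is a minimal sketch. It assumes that speedup means the ratio of median times to solution of two independent instance sets and that each set is resampled with replacement separately; the function name, parameters, and sample values are hypothetical and are not taken from the experiments.

```python
import numpy as np


def bootstrap_median_speedup(tts_wide, tts_narrow, n_boot=2000, seed=0):
    """Ratio of median TTS (wide range / narrow range) with a bootstrap error.

    Assumption: speedup is the ratio of medians of two independent instance
    sets, and each set is resampled with replacement on its own.
    """
    rng = np.random.default_rng(seed)
    wide = np.asarray(tts_wide, dtype=float)
    narrow = np.asarray(tts_narrow, dtype=float)

    point = np.median(wide) / np.median(narrow)

    # Each row of the index arrays is one bootstrap resample.
    idx_w = rng.integers(0, wide.size, size=(n_boot, wide.size))
    idx_n = rng.integers(0, narrow.size, size=(n_boot, narrow.size))
    ratios = np.median(wide[idx_w], axis=1) / np.median(narrow[idx_n], axis=1)

    # One standard deviation of the resampled ratios serves as the error bar.
    return point, ratios.std(ddof=1)


# Hypothetical TTS values (arbitrary units) for two instance sets.
speedup, err = bootstrap_median_speedup([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(f"median speedup = {speedup:.2f} +/- {err:.2f}")
```

A value above 1 would mean that the narrower coupling range is solved faster in the median case; plotting it against problem size gives the slopes discussed for Fig. 11.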

As further illustration of why this should be the case,
Fig. 15 takes the range-∞ KingFL testbed and plots the
proportion of nonzero couplers exceeding a given range
limit for each value of α. For a fixed α, the structural
impact of imposing a coupler range limit of R diminishes
quickly as R increases. In other words, in range-∞
instances scaled to the interval [−1, 1], the scaling factor
is determined by the tail of the coupler value
distribution, which we expect to be insignificant to the overall
combinatorial structure of the problem.
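
The quantity plotted in Fig. 15 can be sketched in a few lines. The example below is a minimal illustration, assuming that an instance is given as a flat list of integer coupler values and that "exceeding a range limit R" means |J| > R; the function names and the toy coupler values are hypothetical.

```python
import numpy as np


def proportion_exceeding(couplers, limits):
    """Fraction of nonzero couplers with |J| > R, for each R in limits."""
    j = np.asarray(couplers, dtype=float)
    magnitudes = np.abs(j[j != 0])
    if magnitudes.size == 0:
        raise ValueError("instance has no nonzero couplers")
    return {r: float(np.mean(magnitudes > r)) for r in limits}


def scale_factor(couplers):
    """Factor that maps the couplers into [-1, 1]: 1 / max|J|."""
    return 1.0 / float(np.max(np.abs(couplers)))


# Hypothetical integer coupler values of one range-unlimited instance.
J = [-1, 1, -2, 0, 3, -1, 5, 0, 1, -1]
print(proportion_exceeding(J, limits=[1, 2, 3, 5]))
print(scale_factor(J))
```

In this toy instance a single coupler of magnitude 5 sets the scale factor to 0.2, although only 1 of the 8 nonzero couplers exceeds a range limit of 3. This is the sense in which the tail of the distribution fixes the scaling while contributing little to the structure.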


C. Ruling out a thermal model for D-Wave 
performance 


In order to support a thermal model for performance
scaling of the D-Wave processor, SAA data should
provide a good match with D-Wave processor data on both
range-2 and range-3 instances at the same final inverse
temperature β_f, for various choices of α. In other words,
it should have predictive power [1]. Hen et al. found
a good match between SAA and their D-Wave processor
data at β_f = 5 for certain values of α [1]. In Fig. 3 we can
see the scaling coefficients for the D-Wave processor and
for SAA with β_f ∈ {3, 4, 5}. The ratio of temperature to
energy scale is similar for the D-Wave processor studied
here and that studied by Hen et al. (~7% difference).
For range-3 instances, β_f = 4 appears to be too high
and β_f = 3 is clearly too low. Although β_f = 4 gives a
better fit, its scaling does not match that of the D-Wave
processor on low-α range-2 instances.

Fig. 16 shows instance-wise scatter plots of success
probabilities for the D-Wave processor and SAA at β_f =
4. It is clear from these data that the best correlation is
poor when compared with the data of Hen et al.,
particularly looking at each problem size individually, and
that there is not a good fit between success
probabilities on range-2 instances. These data fail to support the
hypothesis that SAA provides a thermal model for
performance scaling on frustrated loop instances. On the
contrary, there are several pieces of evidence that in this
context, analog control error causes the appearance of
thermal behavior. First is the poor correlation compared
to that found by Hen et al.; the data shown in Fig. 16 for
FIG. 12. Comparison of KingFL and HenFL instances on different hardware graphs. Shown here are median TTS
for HFS on range-unlimited KingFL and HenFL instances. The KingFL instances are constructed on subgrids of a random C_16
graph with similar yield to the SR10-V6 processor (91%), while HenFL instances are constructed on subgrids of a random C_16
graph with similar yield to the ISI-V6 processor (98%).


[Figure residue. Panel titles: HFS, range 2 vs. ∞; HFS, range 3 vs. ∞.]

FIG. 13. Benefit of reduced coupling range for HFS. Here we show median speedup arising from changing the coupling
range for HFS on a broad set of inputs: α ∈ {0.100, 0.125, ..., 0.500} and L ∈ {4, ..., 14}. Instances are constructed on full C_L
graphs. Error bars represent one standard deviation from bootstrap samples.


range-3 instances represents the best visual fit between
the D-Wave processor and SAA data out of all our
experiments. Second is the fact that we do not find agreement
at the same inverse temperature β_f = 5. Third is given
by Hen et al. ([1] Fig. 12), who show scaling coefficients
for various choices of β_f, along with results on perturbed
instances at β_f = 5. Their data suggest that perturbing
the Hamiltonian has a similar effect to increasing the
apparent temperature of SAA.
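
The instance-wise comparison behind Fig. 16 can be summarized numerically. The following is a minimal sketch, assuming the Pearson coefficient as the measure of correlation (the text itself refers only to correlation and visual fit) and computing it both over the pooled data and for each problem size separately; the function name and the sample probabilities are hypothetical.

```python
import numpy as np


def correlation_by_size(p_dwave, p_saa, sizes):
    """Pearson correlation of success probabilities, pooled and per size L."""
    p_dwave = np.asarray(p_dwave, dtype=float)
    p_saa = np.asarray(p_saa, dtype=float)
    sizes = np.asarray(sizes)

    # Pooled correlation over all instances.
    result = {"all": float(np.corrcoef(p_dwave, p_saa)[0, 1])}

    # Correlation restricted to the instances of one problem size at a time.
    for size in np.unique(sizes):
        mask = sizes == size
        result[int(size)] = float(np.corrcoef(p_dwave[mask], p_saa[mask])[0, 1])
    return result


# Hypothetical success probabilities for six instances of sizes L = 4 and 8.
sizes = [4, 4, 4, 8, 8, 8]
p_dwave = [0.1, 0.2, 0.3, 0.5, 0.6, 0.7]
p_saa = [0.2, 0.4, 0.6, 0.7, 0.6, 0.5]
print(correlation_by_size(p_dwave, p_saa, sizes))
```

Splitting by size matters because a pooled coefficient can be dominated by the overall drop in success probability as problems grow, which is why the text asks for the fit to hold for each problem size as a separate data set.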
[Figure residue. Panel titles: HenFL precision limits for α = 0.25; HenFL precision limits for α = 0.35; KingFL precision limits for α = 0.25. x-axis: coupling range.]

FIG. 14. Distribution of coupling range for range-unlimited instances. Here we see the change in the distribution of
coupling range for random HenFL and KingFL instances at α = 0.25 and α = 0.35. Each data series consists of 1000 randomly
generated instances on a full Chimera graph C_L. Naturally, coupling range increases as α increases. KingFL instances have
lower coupling range than corresponding HenFL instances, owing to their lower average loop length.


[Figure residue. Panel title: Proportion of nonzero couplers exceeding a given range.]

FIG. 15. Couplers exceeding a given range. Shown is the mean proportion of nonzero couplers in a C_8 range-∞ instance exceeding
each range limit. For fixed α, the proportion of couplers exceeding the range limit R decreases superexponentially with R.
Error bars represent one standard deviation from each data set of 200 instances.


































[Figure residue. Panel titles: Range 2, α = 0.25; Range 2, α = 0.30.]

FIG. 16. Success probability correlations between the D-Wave processor and SAA. Shown are instance-wise scatters
for the most difficult problems (α ∈ {0.25, 0.30}) for range-2 and range-3 instances. Instances receive colors according to their
size. SAA has β_f = 4. Correlation on the range-3 instances is weak compared to correlation found by Hen et al. [1], whereas
correlation on range-2 instances, particularly for each problem size as a separate data set, is poor.












